BigDFT.Stats module
A module to describe information coming from ensemble averaging
- symmetrize_df(df1)[source]
From a dataframe that should be a symmetric matrix, construct the symmetrized dataframe
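The idea of symmetrizing can be illustrated with plain pandas. This is a minimal sketch and not the BigDFT implementation: the helper name `symmetrize_sketch` is hypothetical, and it assumes that "symmetrizing" means filling each missing entry with its mirrored counterpart, and that mirrored entries agree whenever both are present.

```python
import numpy as np
import pandas as pd


def symmetrize_sketch(df):
    """Hypothetical helper: fill each missing entry from its transposed position."""
    # Use a common set of labels on both axes so the transpose is aligned.
    labels = df.index.union(df.columns)
    full = df.reindex(index=labels, columns=labels)
    # combine_first keeps existing values and fills NaN from the transpose.
    return full.combine_first(full.T)


df = pd.DataFrame(
    [[1.0, 2.0, np.nan],
     [np.nan, 4.0, 5.0],
     [3.0, np.nan, 6.0]],
    index=list("abc"),
    columns=list("abc"),
)
sym = symmetrize_sketch(df)
print(sym)
```

In this example every off-diagonal pair has exactly one known value, so the result is a fully populated symmetric matrix.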
- clean_dataframe(df, symmetrize=True)[source]
Symmetrize a dataframe and remove the NaN rows and columns
- Parameters:
df (Dataframe) – the dataframe to clean
symmetrize (bool) – symmetrize the dataframe if applicable
- Returns:
the cleaned dataframe
- Return type:
Dataframe
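A rough picture of the cleaning step, written with plain pandas. This is a sketch under stated assumptions rather than the library code: the helper `clean_sketch` is hypothetical, "NaN rows and columns" is taken to mean rows and columns that are entirely NaN, and "if applicable" is taken to mean that rows and columns share the same labels.

```python
import numpy as np
import pandas as pd


def clean_sketch(df, symmetrize=True):
    """Hypothetical helper: optionally symmetrize, then drop all-NaN rows/columns."""
    out = df
    # Assumption: symmetrization only makes sense when both axes share labels.
    if symmetrize and set(out.index) == set(out.columns):
        out = out.combine_first(out.T)
    # Drop rows, then columns, that contain no data at all.
    return out.dropna(axis=0, how="all").dropna(axis=1, how="all")


raw = pd.DataFrame(
    [[0.0, 1.5, np.nan],
     [np.nan, 0.0, np.nan],
     [np.nan, np.nan, np.nan]],
    index=list("abc"),
    columns=list("abc"),
)
cleaned = clean_sketch(raw)
print(cleaned)
```

Here the empty row and column labelled `c` are removed, and the missing `(b, a)` entry is filled from `(a, b)`.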
- stacked_dataframe(pop)[source]
Construct a stacked dataframe with all the data of the population
Warning
Weights are ignored, so the average value of the stacked dataframe may differ from the population mean.
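The warning can be made concrete with a small numerical example. The sketch below does not use the BigDFT population object: it assumes, for illustration only, a population given as a list of pandas Series (`samples`) with a matching list of `weights`, and compares the plain column mean of the stacked data with the weighted mean.

```python
import numpy as np
import pandas as pd

# Hypothetical population: three samples with two features each.
samples = [
    pd.Series({"energy": 1.0, "charge": 0.0}),
    pd.Series({"energy": 3.0, "charge": 1.0}),
    pd.Series({"energy": 5.0, "charge": 2.0}),
]
weights = [1.0, 1.0, 2.0]  # the last sample counts twice

# Stacking: one row per sample, weights discarded.
stacked = pd.DataFrame(samples)
plain_mean = stacked.mean()

# Weighted mean of the same data, for comparison.
weighted_mean = pd.Series(
    np.average(stacked.to_numpy(), axis=0, weights=weights),
    index=stacked.columns,
)

print(plain_mean)
print(weighted_mean)
```

With these numbers the stacked mean of `energy` is 3.0, while the weighted mean is 3.5, which is the discrepancy the warning refers to.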
- weighted_dataframe(dfs, wgts)[source]
Construct a single dataframe that includes the provided weights. Useful in situations where one wants a single view of a weighted population
- transverse_dataframe(population, target_features, target_samples)[source]
Construct a transverse dataframe that gathers the data from some target features and samples of a population. This may be useful to show the variability of some data for particular entries
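As an illustration of what a "transverse" view could look like, the sketch below selects a few features for a few samples and arranges them in one table. The data layout is an assumption made for this example (a dict mapping hypothetical sample labels to pandas Series), and `transverse_sketch` is a hypothetical helper, not the library function.

```python
import pandas as pd

# Hypothetical population: sample label -> feature values.
population = {
    "snap0": pd.Series({"energy": -1.0, "dipole": 0.10, "gap": 3.0}),
    "snap1": pd.Series({"energy": -1.2, "dipole": 0.12, "gap": 3.1}),
    "snap2": pd.Series({"energy": -0.9, "dipole": 0.08, "gap": 3.2}),
}


def transverse_sketch(population, target_features, target_samples):
    """Hypothetical helper: rows are the chosen samples, columns the chosen features."""
    picked = {s: population[s][target_features] for s in target_samples}
    return pd.DataFrame(picked).T


view = transverse_sketch(population, ["energy", "gap"], ["snap0", "snap2"])
print(view)
```

Reading down a column of `view` then shows how one feature varies across the selected samples.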
- concatenate_populations(populations, extra_labels={})[source]
Write a file containing the set of populations we want to serialize
- safe_multiply_and_pow(data, a, pw)[source]
Perform a multiplication and power expansion that is resilient to dataframes that contain sequences that are not floating-point number compatible
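One way to make such an operation tolerant of non-numeric data is to restrict it to numeric columns. The sketch below is not the BigDFT implementation: the helper name `multiply_and_pow_sketch` is hypothetical, the formula `(a * x) ** pw` is an assumption about the order of operations, and non-numeric columns (strings here) are simply left untouched.

```python
import pandas as pd


def multiply_and_pow_sketch(data, a, pw):
    """Hypothetical helper: apply (a * x) ** pw to numeric columns only."""
    out = data.copy()
    # Columns that hold strings or other objects are skipped.
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = (a * out[num_cols]) ** pw
    return out


df = pd.DataFrame({"value": [1.0, 2.0, 3.0], "label": ["x", "y", "z"]})
res = multiply_and_pow_sketch(df, 2.0, 2)
print(res)
```

The `value` column becomes 4.0, 16.0, 36.0, while `label` passes through unchanged instead of raising an error.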
- class ClusterGrammer(df)[source]
A class that facilitates the use of clustergrammer objects
- Parameters:
df (pandas.DataFrame) – the dataframe to represent in the clustergrammer
- represent_only(axis, elements)[source]
Represent only the elements that are indicated on the given axis